Document Retrieval, Automatic
Author: Elizabeth D. Liddy, Center for Natural Language Processing, School of Information Studies
Abstract
Document retrieval is the computerized process of producing a relevance-ranked list of documents in response to an inquirer's request by comparing that request to an automatically produced index of the documents in the system. Such systems are in everyday use today in the form of web-based search engines. While the field has grown from a fairly small discipline in the 1940s into a large, profitable industry, it has maintained a healthy research focus, supported by test collections and large-scale annual comparative tests of systems. A document retrieval system comprises three core modules: a document processor, a query analyzer, and a matching function. Document retrieval systems are based on one of several theoretical models: Boolean, Vector Space, Probabilistic, and Language Model.
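The three modules named in the abstract can be illustrated with a minimal sketch under the Vector Space model. This is not the author's system: the function names (`build_index`, `analyze_query`, `rank`), the simple tokenizer, and the particular TF-IDF weighting are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Document processor (hypothetical): per-document term counts
    plus the number of documents each term occurs in."""
    term_counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())
    return term_counts, doc_freq


def analyze_query(query):
    """Query analyzer (hypothetical): same tokenization as the documents."""
    return Counter(tokenize(query))


def tfidf(counts, doc_freq, n_docs):
    """Assumed weighting: tf * (ln(N / df) + 1); unseen terms are dropped."""
    return {
        term: tf * (math.log(n_docs / doc_freq[term]) + 1.0)
        for term, tf in counts.items()
        if doc_freq.get(term, 0) > 0
    }


def cosine(a, b):
    """Cosine similarity between two sparse weight vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Matching function (hypothetical): score every document against
    the query and return (doc_id, score) pairs, best match first."""
    term_counts, doc_freq = build_index(docs)
    n_docs = len(docs)
    query_vec = tfidf(analyze_query(query), doc_freq, n_docs)
    scores = [
        (doc_id, cosine(query_vec, tfidf(counts, doc_freq, n_docs)))
        for doc_id, counts in term_counts.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    sample_docs = {
        "d1": "boolean retrieval model",
        "d2": "vector space model for document retrieval",
        "d3": "cooking pasta at home",
    }
    for doc_id, score in rank("vector space retrieval", sample_docs):
        print(doc_id, round(score, 3))
```

A Boolean, Probabilistic, or Language Model system would keep the same three-module shape and replace only the weighting and scoring steps.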
Similar resources
Multiple And Single Document Summarization Using DR-LINK
Our Tipster Phase III research objective for the Summarization task is to produce a single summary across multiple documents returned from a search on an information retrieval system. An established set of metrics to evaluate the performance of our system is not available in this field at present, so this research is also developing a procedure to evaluate the summaries we create. We hope to un...
Illuminating Trouble Tickets with Sublanguage Theory
A study was conducted to explore the potential of Natural Language Processing (NLP)-based knowledge discovery approaches for the task of representing and exploiting the vital information contained in field service (trouble) tickets for a large utility provider. Analysis of a subset of tickets, guided by sublanguage theory, identified linguistic patterns, which were translated into rule-based alg...
Natural Language Processing for Information Retrieval and Knowledge Discovery
Natural Language Processing (NLP) is a powerful technology for the vital tasks of information retrieval (IR) and knowledge discovery (KD) which, in turn, feed the visualization systems of the present and future and enable knowledge workers to focus more of their time on the vital tasks of analysis and prediction. First, a definition of NLP. Natural language processing is a set of computational ...
DR-LINK: A System Update for TREC-2
The theoretical goal underlying the DR-LINK System is to represent and match documents and queries at the various linguistic levels at which human language conveys meaning. Accordingly, we have developed a modular system which processes and represents text at the lexical, syntactic, semantic, and discourse levels of language. In concert, these levels of processing permit DR-LINK to achieve a le...